Use of residue pairs in protein sequence-sequence and sequence-structure alignments.
Abstract
Two new sets of scoring matrices are introduced: H2 for protein sequence comparison and T2 for protein sequence-structure correlation. Each element of H2 or T2 measures the frequency with which a pair of amino acid types in one protein, k residues apart in the sequence, is aligned with another pair of residues, of given amino acid types (for H2) or in given structural states (for T2), in other structurally homologous proteins. H2 and T2 each come in four types, corresponding to k-values of 1 to 4. These matrices were set up using a large number of structurally homologous protein pairs, with little sequence homology between the members of each pair, that were recently generated using the structure comparison program SHEBA. The two scoring matrices were incorporated into the main body of the sequence alignment program SSEARCH in the FASTA package and tested in a fold recognition setting in which each of a set of 107 test sequences was aligned to each of a panel of 3,539 domains that represent all known protein structures. Six procedures were tested: the straight Smith-Waterman (SW) and FASTA procedures, which used the Blosum62 single-residue-type substitution matrix; the BLAST and PSI-BLAST procedures, which also used the Blosum62 matrix; PASH, which used the Blosum62 and H2 matrices; and PASSC, which used the Blosum62, H2, and T2 matrices. All procedures gave similar results when the probe and target sequences had greater than 30% sequence identity. However, when the sequence identity was below 30%, a similar structure could be found for more sequences using PASSC than using any other procedure. PASH and PSI-BLAST gave the next best results.
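The counting idea behind H2 can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the `-` gap character, the decision to skip any column containing a gap, and the choice to measure k along the first sequence are all assumptions made for illustration. The abstract does not say how raw frequencies are turned into scores, so the sketch stops at raw counts.

```python
from collections import Counter

def aligned_columns(seq_a, seq_b):
    """Map residue index in seq_a -> (res_a, res_b) for gap-free alignment columns.

    seq_a and seq_b are the two rows of a pairwise alignment (equal length,
    '-' marks a gap). Columns with a gap in either row are skipped (assumption).
    """
    columns = {}
    index_a = 0  # residue index in the ungapped first sequence
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a != "-":
            if res_b != "-":
                columns[index_a] = (res_a, res_b)
            index_a += 1
    return columns

def count_pair_substitutions(seq_a, seq_b, k_values=(1, 2, 3, 4)):
    """Count how often a residue pair k apart in seq_a aligns with a pair in seq_b.

    Returns {k: Counter} keyed by ((a1, a2), (b1, b2)), where a1 and a2 are
    k residues apart in seq_a and b1, b2 are the residues aligned to them.
    """
    columns = aligned_columns(seq_a, seq_b)
    counts = {k: Counter() for k in k_values}
    for i, (a1, b1) in columns.items():
        for k in k_values:
            if i + k in columns:
                a2, b2 = columns[i + k]
                counts[k][((a1, a2), (b1, b2))] += 1
    return counts

# Hypothetical toy alignment; real input would be structure-based alignments.
counts = count_pair_substitutions("ACDEF", "AC-EY")
for k, counter in counts.items():
    print(k, dict(counter))
```

Replacing the second row with a string of structural-state labels (for example, one hypothetical letter per secondary-structure class) would give the analogous raw counts for a T2-style matrix under the same assumptions.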
Similar resources
Investigation of Consecutive Separating Arrangements of Bio active Compounds from Black Tea (Camellia sinensis) Residue
Every year, large amounts of black tea (Camellia sinensis (L.) Kuntze) residue are produced in factories. This residue is unusable, although its bioactive compounds can be extracted and used in the drug and food industries. Because of these problems, this project was conducted in 2011 - 2012 with the aim of studying the consecutive isolation of all bioactive compounds from tea residu...
Identifying sequence-structure pairs undetected by sequence alignments.
We examine how effectively simple potential functions previously developed can identify compatibilities between sequences and structures of proteins for database searches. The potential function consists of pairwise contact energies, repulsive packing potentials of residues for overly dense arrangement and short-range potentials for secondary structures, all of which were estimated from statist...
Multiple protein sequence alignment from tertiary structure comparison: assignment of global and residue confidence levels.
An algorithm is presented for the accurate and rapid generation of multiple protein sequence alignments from tertiary structure comparisons. A preliminary multiple sequence alignment is performed using sequence information, which then determines an initial superposition of the structures. A structure comparison algorithm is applied to all pairs of proteins in the superimposed set and a similari...
Protein sequence-structure alignment based on site-alignment probabilities.
A protein sequence-structure alignment method for database searches is examined on how effectively this method together with a simple scoring function previously developed can identify compatibilities between sequences and structures of proteins. The scoring function consists of pairwise contact energies, repulsive packing potentials of residues for overly dense arrangement and short-range pote...
A MODEL FOR THE BASIC HELIX-LOOP-HELIX MOTIF AND ITS SEQUENCE SPECIFIC RECOGNITION OF DNA
A three-dimensional model of the basic Helix-Loop-Helix motif and its sequence-specific recognition of DNA is described. The basic-helix I is modeled as a continuous α-helix because no α-helix-breaking residue is found between the basic region and the first helix. When the basic regions of the two peptide monomers are aligned in the successive major groove of the cognate DNA, the hydrophobi...
DPANN: improved sequence to structure alignments following fold recognition.
In fold recognition (FR) a protein sequence of unknown structure is assigned to the closest known three-dimensional (3D) fold. Although FR programs can often identify among all possible folds the one a sequence adopts, they frequently fail to align the sequence to the equivalent residue positions in that fold. Such failures frustrate the next step in structure prediction, protein model building...
Journal title:
- Protein science : a publication of the Protein Society
Volume 9, Issue 8
Pages -
Publication date 2000